15 research outputs found

    Markov Decision Process Based Energy-Efficient On-Line Scheduling for Slice-Parallel Video Decoders on Multicore Systems

    We consider the problem of energy-efficient on-line scheduling for slice-parallel video decoders on multicore systems. We assume that each processor is Dynamic Voltage Frequency Scaling (DVFS) enabled, so that it can independently trade off performance for power while taking the video decoding workload into account. In the past, scheduling and DVFS policies in multicore systems have been formulated heuristically because of the inherent complexity of the on-line multicore scheduling problem. The key contribution of this report is that we rigorously formulate the problem as a Markov decision process (MDP), which simultaneously takes into account the on-line scheduling and per-core DVFS capabilities; the power consumption of the processor cores and caches; and the loss-tolerant and dynamic nature of the video decoder's traffic. In particular, we model the video traffic using a Directed Acyclic Graph (DAG) to capture the precedence constraints among frames in a Group of Pictures (GOP) structure, while also accounting for the fact that frames have different display/decoding deadlines and non-deterministic decoding complexities. The objective of the MDP is to minimize long-term power consumption subject to a minimum Quality of Service (QoS) constraint related to the decoder's throughput. Although MDPs notoriously suffer from the curse of dimensionality, we show that, with appropriate simplifications and approximations, the complexity of the MDP can be mitigated. We implement a slice-parallel version of H.264 on a multiprocessor ARM (MPARM) virtual platform simulator, which provides cycle-accurate and bus signal-accurate simulation for different processors. We use this platform to generate realistic video decoding traces with which we evaluate the proposed on-line scheduling algorithm in Matlab.
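As a rough illustration of the DAG model mentioned in this abstract, the sketch below encodes the precedence constraints of a small GOP and derives a valid decoding order. The four-frame GOP, the frame names and the `decode_order` helper are assumptions made for illustration; they are not taken from the report.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical GOP with display order I0 B1 B2 P3.
# Each frame maps to the set of frames it needs decoded first.
gop_deps = {
    "I0": set(),          # intra frame: no references
    "P3": {"I0"},         # predicted from the I frame
    "B1": {"I0", "P3"},   # bi-predicted from both anchors
    "B2": {"I0", "P3"},
}

def decode_order(deps):
    """Return one decoding order that respects every precedence constraint."""
    return list(TopologicalSorter(deps).static_order())

order = decode_order(gop_deps)
print(order)
```

The anchor frames come out first because the B frames cannot be decoded until both of their references are available; a scheduler built on such a graph can only dispatch frames whose predecessors are complete.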

    A Real-Time Compressed Sensing-Based Personal Electrocardiogram Monitoring System

    Wireless body sensor networks (WBSN) promise to enable next-generation patient-centric tele-cardiology systems. A WBSN-enabled electrocardiogram (ECG) monitor consists of wearable, miniaturized wireless sensors that measure cardiac signals and report them wirelessly to a WBSN coordinator, which is responsible for forwarding them to the tele-health provider. However, state-of-the-art WBSN-enabled ECG monitors still fall short of the required functionality, miniaturization and energy efficiency. Among these, energy efficiency can be significantly improved through embedded ECG compression, which reduces airtime over energy-hungry wireless links. In this paper, we propose a novel real-time energy-aware ECG monitoring system based on the emerging compressed sensing (CS) signal acquisition/compression paradigm for WBSN applications. For the first time, CS is demonstrated as an advantageous real-time and energy-efficient ECG compression technique, with a computationally light ECG encoder on the state-of-the-art Shimmer™ wearable sensor node and a real-time decoder running on an iPhone (acting as a WBSN coordinator). Our results show an average CPU usage of less than 5% on the node and less than 30% on the iPhone.
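To give a feel for why a CS encoder can be computationally light, here is a minimal sketch of the encoding step y = Φx, in which a window of n samples is reduced to m < n measurements. The sparse binary sensing matrix, the window size and all function names are assumptions for illustration; the abstract does not specify the matrix or parameters used on the node.

```python
import random

def make_sparse_binary_columns(m, n, ones_per_column, seed=0):
    """For each of the n columns, pick the rows that hold a 1 (all other entries are 0)."""
    rng = random.Random(seed)
    return [rng.sample(range(m), ones_per_column) for _ in range(n)]

def cs_encode(x, columns, m):
    """Compute y = Phi @ x using additions only, since Phi has 0/1 entries."""
    y = [0] * m
    for j, rows in enumerate(columns):
        for r in rows:
            y[r] += x[j]
    return y

# Hypothetical window of 8 integer ECG samples compressed to 4 measurements.
samples = [3, -1, 4, 1, -5, 9, 2, -6]
columns = make_sparse_binary_columns(m=4, n=8, ones_per_column=2)
measurements = cs_encode(samples, columns, m=4)
print(measurements)
```

With this kind of matrix the encoder needs no multiplications, which is one plausible way to keep sensor-side CPU load low; signal reconstruction, the expensive part, is left to the coordinator and is not shown here.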

    Online Energy-Efficient Task-Graph Scheduling for Multicore Platforms

    Numerous Directed-Acyclic Graph (DAG) schedulers have been developed to improve the energy efficiency of various multi-core platforms. However, these schedulers make a priori assumptions about the relationship between the task dependencies, and they are unable to adapt online to the characteristics of each application without offline profiling data. Therefore, we propose a novel energy-efficient online scheduling solution for the general DAG model to address these two problems. Our proposed scheduler adapts at runtime to the characteristics of each application by making foresighted decisions, which take into account the impact of current scheduling decisions on present and future deadline miss rates and energy efficiency. Moreover, our scheduler efficiently handles execution with very limited resources by not scheduling tasks that are expected to miss their deadlines and have no impact on future deadlines. We validate our approach against state-of-the-art solutions. In our first set of experiments, results with the H.264 video decoder demonstrate that the proposed low-complexity solution for the general DAG model reduces energy consumption by up to 15% compared to an existing, more complex scheduler built specifically for the H.264 video decoder application. In our second set of experiments, results with different configurations of synthetic DAGs demonstrate that our proposed solution reduces energy consumption by up to 55% and deadline miss rates by up to 99% compared to a second existing scheduling solution. Finally, we show that our DAG flow manager (DFM) and scheduler have low complexity on a real mobile platform, and that our solution is resilient to workload prediction errors, using estimators of different accuracies.
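The rule of skipping tasks that are expected to miss their deadlines and do not affect future deadlines can be sketched as a simple predicate. This is only an illustration under stated assumptions: `should_skip`, its parameters, and the use of "has dependent tasks" as a stand-in for "impacts future deadlines" are hypothetical and not taken from the paper.

```python
def should_skip(now, estimated_exec_time, deadline, has_dependents):
    """Skip a task only if it is predicted to finish late AND nothing depends on it.

    Assumption: a task with dependents can still affect future deadlines,
    so it is kept even when it is expected to be late.
    """
    expected_to_miss = now + estimated_exec_time > deadline
    return expected_to_miss and not has_dependents

# Hypothetical tasks: (name, estimated execution time, deadline, has dependents).
tasks = [
    ("late_leaf", 8.0, 5.0, False),    # late and unused by others: skipped
    ("late_anchor", 8.0, 5.0, True),   # late, but other tasks need it: kept
    ("on_time_leaf", 2.0, 5.0, False), # meets its deadline: kept
]
skipped = [name for name, t, d, dep in tasks if should_skip(0.0, t, d, dep)]
print(skipped)
```

The execution-time estimate would come from a workload predictor, which is why the abstract's resilience to prediction errors matters for a rule of this kind.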

    Big-Data Streaming Applications Scheduling Based on Staged Multi-armed Bandits

    Several techniques have recently been proposed to adapt Big-Data streaming applications to existing many-core platforms. Among these techniques, online reinforcement learning methods have been proposed that learn how to adapt at run-time the throughput and resources allocated to the various streaming tasks, depending on dynamically changing data stream characteristics and the desired application performance (e.g., accuracy). However, most state-of-the-art techniques consider only a single input stream in their application model and assume that the system knows the amount of resources to allocate to each task to achieve a desired performance. To address these limitations, in this paper we propose a new systematic and efficient methodology and associated algorithms for online learning and energy-efficient scheduling of Big-Data streaming applications with multiple streams on many-core systems with resource constraints. We formalize the problem of multi-stream scheduling as a staged decision problem in which the performance obtained for various resource allocations is unknown. The proposed scheduling methodology uses a novel class of online adaptive learning techniques which we refer to as staged multi-armed bandits (S-MAB). Our scheduler learns online which processing method to assign to each stream and how to allocate its resources over time in order to maximize performance at run-time, without access to any offline information. Applied to a face detection streaming application, and without using any offline information, the proposed scheduler achieves performance similar to that of an optimal semi-online solution that has full knowledge of the input stream: the differences in throughput, observed quality, resource usage and energy efficiency are less than 1%, 0.3%, 0.2% and 4%, respectively.
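For readers unfamiliar with bandit-based learning, the sketch below shows the standard UCB1 algorithm choosing among a few processing options whose rewards are unknown in advance. It is not the S-MAB method of the paper, which adds a staged structure not described in enough detail here; the arms, reward values and function names are assumptions for illustration.

```python
import math

def ucb1(pull, n_arms, rounds):
    """Standard UCB1: try each arm once, then pick the arm with the best optimistic estimate."""
    counts = [0] * n_arms
    means = [0.0] * n_arms
    for t in range(1, rounds + 1):
        if t <= n_arms:
            arm = t - 1  # initial pass: pull every arm once
        else:
            arm = max(
                range(n_arms),
                key=lambda a: means[a] + math.sqrt(2 * math.log(t) / counts[a]),
            )
        reward = pull(arm)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # running average
    return counts, means

# Hypothetical fixed rewards for three processing methods (for example, accuracy per unit of energy).
true_rewards = [0.2, 0.5, 0.9]
counts, means = ucb1(lambda arm: true_rewards[arm], n_arms=3, rounds=300)
print(counts)
```

The learner spends a small number of pulls exploring the weaker options and concentrates on the best one, which is the basic behaviour that lets a scheduler learn allocations without offline profiling.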

    Low Power and Scalable Many-Core Architecture for Big-Data Stream Computing

    In recent years, examining large amounts of different types of data, or Big-Data, in an effort to uncover hidden patterns or unknown correlations has become a major need in our society. In this context, stream mining applications are now widely used in several domains such as financial analysis, video annotation, surveillance, medical services and traffic prediction. To cope with the Big-Data stream input and its high variability, modern stream mining applications implement systems with heterogeneous classifiers and adapt online to variations in the characteristics of their input data streams. Moreover, unlike existing architectures for video processing and compression applications, where the processing units are reconfigurable in terms of parameters and possibly even functions as the input data changes, in Big-Data stream mining applications the complete computing pipeline changes, as entirely new classifiers and processing functions are invoked depending on the input stream. As a result, new approaches to reconfigurable hardware platform architectures are needed to handle Big-Data streams. However, the hardware solutions proposed so far for stream mining applications either target high performance computing without any power consideration (i.e., limiting their applicability in small-scale computing infrastructures or current embedded systems), or they are dedicated to a specific learning algorithm (i.e., limited to running a single type of classifier). Therefore, in this paper we propose a novel low-power many-core architecture for stream mining applications that copes with the dynamic data-driven nature of these applications while consuming limited power. Our exploration indicates that the proposed architecture adapts to different classifier complexities thanks to its multiple scalable vector processing units and their reconfigurability at runtime. Moreover, our platform architecture includes a memory hierarchy optimized for Big-Data streaming and implements modern fine-grained power management techniques across all the different types of cores, thus allowing minimum energy consumption for each type of executed classifier.

    Energy-Efficient Co-Design Optimization of Many-Core Platforms for Big-Data Streaming Applications

    Big-Data streaming applications are used in several domains such as social media analysis, financial analysis, video annotation, surveillance, medical services and traffic prediction. These applications, running on different types of platforms from mobile devices to servers, are characterized by a highly variable stochastic input data stream, stringent delay constraints and complex task graphs. Several software and hardware optimization techniques have been proposed to maximize the execution quality and the throughput of these applications and to minimize their energy consumption on many-core platforms. Analyzing the existing techniques shows that most solutions can be classified as a hardware-based task-specific optimization, an operating system scheduler optimization, or a load shedding mechanism at the application level. Each of these categories is limited in scope and can be blind to the nature of the program, the data being processed or the characteristics of the hardware. Big-Data streaming applications, because of their wide range of host hardware and content and their dynamically changing input streams, expose the fragmentation of the optimization techniques and create a clear need for a better approach. In this thesis, I propose a suite of energy-efficient hardware-software co-design techniques to bridge the gap between modern Big-Data streaming applications and existing many-core platforms. I model the task graph of the class of applications considered here as a directed acyclic graph (DAG). First, at the application layer, I propose a unified DAG monitoring solution that processes the general DAG model of the application online and provides a set of relevant information that is leveraged at run time by a connected scheduler. At the operating system layer, I propose three different online scheduling solutions for many-core platforms which leverage the feedback from both the application and the hardware layers.
    The first scheduler addresses the problem of minimizing the energy consumption and the deadline miss rates of multimedia applications. It takes advantage of the output of the DAG monitoring solution to adapt the mapping of the tasks of multimedia applications to the hardware according to the detected performance and targeted quality of service. The second and third schedulers address the problem of maximizing the quality and throughput and minimizing the energy consumption of Big-Data stream mining applications with single and multiple streams originating from different sources. To cope with the dynamically changing Big-Data characteristics, the schedulers integrate machine-learning techniques to learn the environment dynamics and the application requirements and to adapt the scheduling policy to the desired quality of service. I show that the proposed schedulers are able to scale the execution of data-mining applications to the system capability even in the presence of concept drift. Last, at the hardware layer, to address the limitations of existing system architectures and to further increase throughput, I propose a novel low-power many-core architecture for modern Big-Data stream mining applications that integrates a novel flexible memory hierarchy able to adapt to the dynamic data-driven nature of the input data stream.

    Markov Decision Process Based Energy-Efficient Scheduling for Slice-Parallel Video Decoding

    We consider the problem of energy-efficient scheduling for slice-parallel video decoders on multicore systems with Dynamic Voltage Frequency Scaling (DVFS) enabled processors. We rigorously formulate the problem as a Markov decision process (MDP), which simultaneously considers the on-line scheduling and per-core DVFS capabilities; the power consumption of the processor cores and caches; and the loss-tolerant and dynamic nature of the video decoder. The objective is to minimize long-term power consumption subject to a minimum Quality of Service (QoS) constraint related to the decoder’s throughput. We evaluate the proposed scheduling algorithm using traces generated from a cycle-accurate multiprocessor ARM simulator. Index Terms: Slice-parallel video decoding, multicore scheduling, multicore power management, dynamic voltag

    A Unified Online Directed Acyclic Graph Flow Manager for Multicore Schedulers

    Numerous Directed-Acyclic Graph (DAG) schedulers have been developed to improve the energy efficiency of various multi-core systems. However, the DAG monitoring modules proposed by these schedulers make a priori assumptions about the workload and the relationship between the task dependencies. As a result, these schedulers are restricted to a limited subset of DAG models. To address this problem, we propose a unified online DAG monitoring solution that is independent of the connected scheduler and able to handle all possible DAG models. Our novel low-complexity solution processes the DAG of the application online and provides relevant information about each task that can be used by any scheduler connected to it. Using H.264/AVC video decoding as an illustrative application and multiple configurations of complex synthetic DAGs, we demonstrate that our solution, connected to a simple external energy-efficient scheduler, achieves significant improvements in energy efficiency and deadline miss rates compared to existing approaches.